AD No. ADA048218


HARMONIC  REGRESSION* 


F.  J.  Anscombe,  Yale  University 

[Invited  address,  October  26,  1977,  to  the  Third  ERDA  Statistical  Symposium, 
at  Battelle  Pacific  Northwest  Laboratories,  Richland,  WA] 

The calculations of ordinary regression analysis (linear regression
by the method of least squares) have been correctly doable for a
century and a half. There have been changes in the computational methods
used. There is plenty to discuss about regression: is it appropriate
for the data, and what do the results mean? No doubt the calculations are
sometimes of little value, but sometimes they are appropriate and lead
to new understanding.

Regression analysis of time series has a much shorter history.
There is a good deal of literature about it, but the literature often
has the air of arm-chair meditation by a non-participant. My concern
has been to implement principles that are in the literature, and devise
a working procedure. Various practical difficulties have been encountered
that do not seem to be discussed in the literature.

Does anyone need to do regression analysis of time series? Conflicting
opinions are heard. Huge amounts of time-series material are
being collected and stored relating to the environment (weather, pollution),
the observations being made daily or even more frequently. Many economic
series are developed for monthly or weekly or daily activities. I have
worked with annual series, which are probably the least satisfactory
material for this kind of study.


*Prepared  in  connection  with  research  supported  by  the  Army,  Navy,  Air 
Force  and  NASA  under  a contract  administered  by  the  Office  of  Naval  Research. 


Some broad generalities are presented below, and an example is shown.
The details are vital, but as they have been fully described elsewhere
they are not given here. (See the references; the detailed presentation
is referred to as Chap. 10.)

Formulation 

We consider regression of one "dependent" variable on just one
"independent" or predictor variable. (Methods extend, of course, but
not without some difficulties, to several predictor variables.) All
means will be supposed zero. Then ordinary linear regression can be
formulated thus. We are given observations on pairs of variables,
$(x_i, y_i)$ for $i = 1, 2, \ldots, m$. We suppose that for all $i$

$$y_i = \beta x_i + \epsilon_i , \tag{1}$$

where the errors $\{\epsilon_i\}$ are considered to be (in some sense)
independent of each other and of the predictor variable $\{x_i\}$. The
method of least squares can be equated to the method of maximum
likelihood when we suppose that the $\{\epsilon_i\}$ are independent random
variables identically distributed $N(0, \sigma^2)$.

How should regression of time series be formulated? We are given
series $\{x_t\}$, $\{y_t\}$, where $t = 1, 2, \ldots, n$. We shall not suppose
these series, nor the error series $\{\epsilon_t\}$ when we introduce it, to
consist of independent elements. We shall instead suppose the series to
be stationary, that is, realizations of some kind of stationary stochastic
process. (In practice the appearance of stationarity with zero mean is
encouraged by subtracting a linear or other trend, usually after taking
logarithms.) To correspond to (1), one might suggest

$$y_t = \beta x_t + \epsilon_t .$$

But if the series are related, the relation may be not simultaneous.
One might have

$$y_t = \beta x_{t-j} + \epsilon_t$$

for some integer lag $j$. But then one might as well postulate

$$y_t = \sum_j \beta_j x_{t-j} + \epsilon_t , \tag{2}$$

where $j$ runs over some suitable set of integer values. This seems to
be the appropriate formulation for stationary processes, to correspond
to (1) for independent processes. The first member of the right side
of (2) represents a linear filtering of $\{x_t\}$.
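As an illustration of relation (2), the following minimal sketch simulates a filtered series plus error. The lags, coefficient values, series length and error variance are all made up for the illustration, and the helper name `linear_filter` is hypothetical.

```python
import random

def linear_filter(x, beta):
    """Return sum_j beta[j] * x[t - j] for every t at which all lagged values exist.

    `beta` maps an integer lag j to a coefficient beta_j (illustrative values only).
    """
    max_lag = max(max(beta), 0)          # largest backward lag used
    min_lag = min(min(beta), 0)          # most negative lag (a lead), if any
    return [sum(b * x[t - j] for j, b in beta.items())
            for t in range(max_lag, len(x) + min_lag)]

random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(20)]     # stand-in predictor series
beta = {0: 0.5, 1: 0.3, 4: -0.2}                    # hypothetical filter coefficients
signal = linear_filter(x, beta)                     # first member of the right side of (2)
errors = [random.gauss(0.0, 0.1) for _ in signal]   # stand-in error series
y = [s + e for s, e in zip(signal, errors)]
```

With a largest lag of 4, the first four time points have no complete set of lagged predictor values, so the simulated series is four entries shorter than the predictor.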

There are two main approaches to trying to estimate the parameters
$\{\beta_j\}$ of the filter in (2).

Time-domain  methods 

One can try direct multiple regression of $\{y_t\}$ on $\{x_t\}$ and on
lagged versions of it, $\{x_{t-j}\}$ for various $j$. There is a difficulty
about deciding how many lags should be considered. If $\{x_t\}$ is strongly
autocorrelated, conditioning will be poor. An accurate representation
of the relation between two stationary stochastic processes could easily
involve a large number of nonzero coefficients $\{\beta_j\}$.
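The conditioning problem can be made concrete for the simplest case of two regressors, $x_t$ and $x_{t-1}$. Their correlation matrix has eigenvalues $1 + r$ and $1 - r$, where $r$ is the lag-one autocorrelation, so its condition number is $(1 + |r|)/(1 - |r|)$. The sketch below uses a simulated first-order autoregressive series with an assumed coefficient of 0.9; the series and the helper names are illustrative only.

```python
import random

def lag1_correlation(x):
    """Sample correlation between x_t and x_{t-1}."""
    a, b = x[1:], x[:-1]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5

def condition_number_two_lags(r):
    """Condition number of [[1, r], [r, 1]], whose eigenvalues are 1 + r and 1 - r."""
    return (1 + abs(r)) / (1 - abs(r))

random.seed(1)
rho = 0.9                                 # assumed autocorrelation, for illustration
x = [0.0]
for _ in range(1999):
    x.append(rho * x[-1] + random.gauss(0.0, 1.0))

r = lag1_correlation(x)
cond = condition_number_two_lags(r)       # grows without bound as r approaches 1
```

Adding further lags of a strongly autocorrelated series can only make the smallest eigenvalue smaller, so the two-lag figure is a lower bound on the difficulty.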

If our reason for trying to fit a relation like (2) is to be able
to forecast $y_t$ from past values of $\{x_t\}$, possibly a very crude
estimate of the $\{\beta_j\}$ will be good enough. The precision of a
forecast is limited by the variance of the error term. The greater
precision that would be attained if the $\{\beta_j\}$ were known exactly may
be only negligibly greater (Cleveland, 1967). Box and Jenkins (1970, 1976)
have presented a set of practical procedures for estimating the structures
of time series well enough for forecasting. If our purpose is not
forecasting, but understanding as well as we can the relation between the
series, the Box-Jenkins methods may be less satisfactory.

It will be argued that some of these difficulties are mitigated or
avoided by frequency-domain methods. However, we must usually be alert
to temporal instability or change in a relation like (2), and that will
be detected by time-domain methods.


Frequency-domain  methods 

The idea is to Fourier-transform the relation (2), and estimate the
transform of $\{\beta_j\}$. It will be suggested that (i) this procedure is
easier to carry out than multiple regression in the time domain, and
that (ii) the results are easier to understand. Claim (i) derives from
the fact that the first member on the right side of (2), the filtering
of $\{x_t\}$, is a convolution of $\{\beta_j\}$ and $\{x_t\}$, and transforms to the
product of the separate transforms of $\{\beta_j\}$ and $\{x_t\}$. Thus (2)
becomes

$$\mathrm{FT}\{y_t\} = (\mathrm{FT}\{\beta_j\}) \cdot (\mathrm{FT}\{x_t\}) + \mathrm{FT}\{\epsilon_t\} .$$
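The convolution property behind claim (i) can be checked numerically. The sketch below assumes a circular (wrap-around) filter and a plain discrete Fourier transform, for which the product rule holds exactly; the data and coefficients are invented for the check.

```python
import cmath

def dft(x):
    """Discrete Fourier transform, computed directly from its definition."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def circular_filter(x, beta):
    """y_t = sum_j beta[j] * x[(t - j) mod n], i.e. the filter applied on a circle."""
    n = len(x)
    return [sum(b * x[(t - j) % n] for j, b in beta.items()) for t in range(n)]

x = [1.0, -2.0, 0.5, 3.0, -1.5, 0.0, 2.0, -0.5]      # made-up series
beta = {0: 0.5, 1: 0.3, 3: -0.2}                     # made-up filter coefficients
y = circular_filter(x, beta)

beta_full = [beta.get(j, 0.0) for j in range(len(x))]
lhs = dft(y)                                          # transform of the filtered series
rhs = [b * xk for b, xk in zip(dft(beta_full), dft(x))]   # product of the transforms
max_gap = max(abs(u - v) for u, v in zip(lhs, rhs))   # zero up to rounding error
```

For a series that has not been made circular the identity holds only approximately, which is one motivation for circularizing a series before transforming it.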


These Fourier transforms are complex-valued functions of a real
variable $\lambda$ representing frequency. Consider a narrow frequency band
(interval for $\lambda$). Suppose that in this interval the transform of
$\{\beta_j\}$ were (near enough) constant. Then in this interval the relation
between $\mathrm{FT}\{y_t\}$ and $\mathrm{FT}\{x_t\}$ would be exactly like relation (1)
above between $\{y_i\}$ and $\{x_i\}$, with the exception that the variables and
the regression coefficient are complex-valued. In real terms,
$\mathrm{FT}\{\beta_j\}$ is conveniently expressed as an amplitude, the gain function
$G(\lambda)$, and an angle, the phase-shift function $\phi(\lambda)$. Thus if
$G(\lambda)$ and $\phi(\lambda)$ could be regarded as constant over the frequency
band, they could be estimated from the transforms of $\{y_t\}$ and $\{x_t\}$ by
a slight modification of the usual procedure for the linear regression
relation (1); expressed in real terms it looks a bit different, but the
procedure is really ordinary linear least squares with two real
coefficients to be estimated. The least-squares procedure is particularly
appropriate if the error process $\{\epsilon_t\}$ is a stationary Gaussian process
whose spectral density is nearly constant over the band.
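A minimal sketch of the constant-gain, constant-phase regression within one band follows. It assumes the transforms at the harmonics in the band are already available as complex numbers, takes the phase as the argument of the fitted complex coefficient (a sign convention chosen here, not taken from the source), and uses invented values throughout; `band_regression` is a hypothetical name.

```python
import cmath

def band_regression(X, Y, weights=None):
    """Weighted least-squares fit of Y_k = b * X_k with one complex coefficient b.

    Returns (gain, phase, r2): |b|, arg(b) in radians, and the squared coherency.
    """
    if weights is None:
        weights = [1.0] * len(X)
    sxy = sum(w * xk.conjugate() * yk for w, xk, yk in zip(weights, X, Y))
    sxx = sum(w * abs(xk) ** 2 for w, xk in zip(weights, X))
    syy = sum(w * abs(yk) ** 2 for w, yk in zip(weights, Y))
    b = sxy / sxx                          # the two real unknowns are Re(b) and Im(b)
    return abs(b), cmath.phase(b), abs(sxy) ** 2 / (sxx * syy)

# Made-up transforms of the predictor series at five harmonics in a band.
X = [1 + 2j, -0.5 + 1j, 2 - 1j, 0.3 - 0.7j, -1.5 - 0.5j]
b_true = cmath.rect(0.7, 0.4)              # assumed gain 0.7 and phase 0.4 radians
Y = [b_true * xk for xk in X]              # error-free response, for checking only

gain, phase, r2 = band_regression(X, Y)
```

With error-free data the fit recovers the assumed gain and phase and the squared coherency is 1; with a nonzero error term it falls below 1.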

However, it has been generally recognized (Akaike and Yamanouchi,
1962, Cleveland and Parzen, 1975, and other writers) that treating
$\phi(\lambda)$ as constant is not satisfactory when its derivative is much
different from 0, and that it is better to approximate the behavior of the
transform of $\{\beta_j\}$ in the narrow frequency band by three real parameters,
the average values of $G(\lambda)$, $\phi(\lambda)$ and $\phi'(\lambda)$ in the band; that
is, treat $G(\lambda)$ as constant and $\phi(\lambda)$ as linear in $\lambda$. Now the
regression procedure is further modified, becoming in fact nonlinear and
requiring an iterative solution, but still computationally rather easy.
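The three-parameter approximation can be written out explicitly. What follows is a sketch under an assumed sign convention for the phase, which the text does not state; $\lambda_0$ (the band center), $\lambda_k$ (the harmonic frequencies in the band) and $w_k$ (the window weights) are notation introduced here for illustration.

$$\mathrm{FT}\{\beta_j\}(\lambda) \approx G \exp\{ i [ \phi + \phi' (\lambda - \lambda_0) ] \} \quad \text{for } \lambda \text{ in the band},$$

$$\min_{G,\,\phi,\,\phi'} \; \sum_k w_k \left| \mathrm{FT}\{y_t\}(\lambda_k) - G \, e^{ i [ \phi + \phi' (\lambda_k - \lambda_0) ] } \, \mathrm{FT}\{x_t\}(\lambda_k) \right|^2 .$$

For any fixed value of $\phi'$ this criterion is an ordinary linear least-squares problem in the single complex number $G e^{i\phi}$, so the nonlinearity lies in $\phi'$ alone. That is one way to see why an iterative solution can remain computationally easy.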

Thus the complete procedure involves examining the frequency range
of $\lambda$ in bands, using a moving "window", and in each band doing a small
computation to determine three real parameters, representing average
values of $G(\lambda)$, $\phi(\lambda)$, $\phi'(\lambda)$. Upon putting the solutions
together we see the whole behavior of $G(\lambda)$ and $\phi(\lambda)$. With $G(\lambda)$
and $\phi(\lambda)$ ...

Intelligibility 

Claim (ii) is that $G(\lambda)$ and $\phi(\lambda)$ are what we need, in order to
understand the relation between $\{y_t\}$ and $\{x_t\}$, rather than the
$\{\beta_j\}$. If the latter were given, we should have to Fourier-transform
them to see qualitatively the effect of the filter. Compare with the
usual commercial description of performance of an amplifier in a
sound-reproduction system.
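To see how filter coefficients translate into gain and phase, the sketch below evaluates the transform of two simple filters. The transform convention $\sum_j \beta_j e^{-2\pi i \lambda j}$ is an assumption made here (the text does not fix one), the pure-delay coefficients are hypothetical, and `frequency_response` is an illustrative name.

```python
import cmath
import math

def frequency_response(beta, lam):
    """Gain and phase (radians) of a filter at frequency lam, in cycles per time unit.

    Assumed convention: transform = sum_j beta[j] * exp(-2*pi*i*lam*j).
    """
    h = sum(b * cmath.exp(-2j * math.pi * lam * j) for j, b in beta.items())
    return abs(h), cmath.phase(h)

# Hypothetical pure delay: y responds negatively to x four time steps earlier.
delay = {4: -0.8}
gain_delay, phase_delay = frequency_response(delay, 1 / 16)

# Two-point filter y_t = x_t - 0.9 * x_{t-1}.
two_point = {0: 1.0, 1: -0.9}
gain_low, _ = frequency_response(two_point, 0.0)     # lowest frequency
gain_high, _ = frequency_response(two_point, 0.5)    # highest frequency
```

A pure delay has constant gain and a phase that is linear in frequency, while the two-point filter passes the highest frequency with gain 1.9 and the lowest with gain 0.1. Neither fact is obvious from the coefficients alone.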


Example 

As an example of methods, we try interrelating an annual series of
total copper-mine output for the U.S. and two economic annual series,
one giving the New York price of copper, the other the total dollar
value of imports of merchandise into the U.S. The copper price series
is thought to reflect the world supply and demand for copper. Changes
in price might be expected to lead to similar changes in production,
possibly a little later. The imports series is taken as an indicator
of the U.S. economy. The copper production series is N233, and the
price series is N241, in Historical Statistics of the United States (1975);
the production figures run from 1845 to 1970, the prices from 1850 to
1970. The figures have been taken exactly as published, except that to
smooth a change in price definition in 1968 the average of two
definitions has been used for 1967. The price figures for 1850-1859 are of
uncertain meaning, and the production figures for before 1860 show a
more rapid proportional rate of growth than later. For present purposes
it has seemed wise to ignore the pre-1860 data. Continuation of the
series from 1970 to 1975 has been obtained from the Statistical Abstract
of the United States (editions for 1975 and 1976). The two series are
reproduced in Figure 1, except that the last two digits of the production
entries have been dropped for ease of reading. The imports series has
been given in Chap. 10 and is not reproduced here; only the portion from
1860 to 1975 is used.

Figure 2 shows a plot against the date of the logarithm of the
production series, with linear regression on date subtracted. Figure 3 is
a similar plot for the price series. A plot for the imports series has
been given in Chap. 10.

The three given series have each 116 entries (for 1860-1975). To
prepare them for Fourier analysis they have been prewhitened by the
steps: (i) take logarithms, (ii) subtract the linear regression on
date, (iii) filter by the two-point filter with weights (-0.9, 1). The
last operation reduces the length of each series to 115. Then the series
have been circularized (tapered) by linearly splicing the first 7 and
the last 7 entries, so that the length of each series becomes 108. The
Fourier transform is made at frequencies (0, 1, 2, ..., 54)/108 cycles
per year; the transform is expressed as a set of (real) coefficients of
cosine and sine terms, or alternatively as a set of squared amplitudes
and phase angles. The frequencies are referred to as harmonics,
numbered 0 through 54.
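The preparation steps can be sketched as follows, mainly to make the length bookkeeping (116, then 115, then 108) concrete. The exact splicing rule is not given in the text; the linear cross-fade used in `circularize` is an assumption, all function names are hypothetical, and the input series is synthetic.

```python
import math

def detrend_linear(z):
    """Subtract the least-squares straight line in t = 0, 1, ..., n-1."""
    n = len(z)
    t_mean, z_mean = (n - 1) / 2, sum(z) / n
    slope = (sum((t - t_mean) * (v - z_mean) for t, v in enumerate(z))
             / sum((t - t_mean) ** 2 for t in range(n)))
    return [v - z_mean - slope * (t - t_mean) for t, v in enumerate(z)]

def prewhiten(series, weight=-0.9):
    """Steps (i)-(iii): logarithms, linear detrending, two-point filter (weight, 1)."""
    resid = detrend_linear([math.log(v) for v in series])
    return [resid[t] + weight * resid[t - 1] for t in range(1, len(resid))]

def circularize(z, overlap=7):
    """Blend the last `overlap` entries into the first `overlap` with linear weights.

    Assumed scheme: the result is `overlap` entries shorter and joins smoothly
    when its end is wrapped around to its beginning.
    """
    n = len(z) - overlap
    out = list(z[:n])
    for k in range(overlap):
        w = (k + 1) / (overlap + 1)        # weight on the early entry rises linearly
        out[k] = w * z[k] + (1 - w) * z[n + k]
    return out

# Synthetic positive series of 116 annual values, standing in for real data.
series = [100.0 * 1.03 ** t * (1.0 + 0.1 * math.sin(t)) for t in range(116)]
white = prewhiten(series)                  # 115 values
circ = circularize(white)                  # 108 values
freqs = [k / len(circ) for k in range(len(circ) // 2 + 1)]   # harmonics 0..54
```

Harmonic 54 then corresponds to 0.5 cycles per year, the highest frequency resolvable in annual data.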

The first step to perceiving an interrelation between any pair of
series is to plot the difference of phase angles at each harmonic against
the harmonic number. Figure 4 shows this for the production and price
series, and Figure 5 for the production and imports series. At each
harmonic, the product of the amplitudes is classified by size into one
of six categories and represented by one of six plotting symbols, the
heavier symbols marking the larger products.
Each phase difference is plotted with this symbol twice over in the
interval from 0 to 8 right angles. In looking for trends, the viewer's
eye should be guided by the heavier symbols.
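The quantity plotted can be sketched as below. Which series is subtracted from which is an assumption here, since the text does not say; one full cycle is 4 right angles, so each difference in [0, 4) reappears 4 units higher when plotted twice. The function names are illustrative.

```python
import cmath
import math

def phase_difference_right_angles(yk, xk):
    """Phase of yk minus phase of xk, in right angles, reduced to the interval [0, 4)."""
    d = cmath.phase(yk * xk.conjugate())       # radians in (-pi, pi]
    return (d / (math.pi / 2)) % 4

def plotted_twice(d):
    """The two plotting positions of one phase difference on the 0-to-8 scale."""
    return (d, d + 4)

in_phase = phase_difference_right_angles(2 + 2j, 1 + 1j)     # same direction
quarter = phase_difference_right_angles(1j, 1 + 0j)          # one right angle ahead
opposite = phase_difference_right_angles(-1 + 0j, 1 + 0j)    # half a cycle apart
behind = phase_difference_right_angles(1 + 0j, 1j)           # one right angle behind
```

Plotting each point twice lets a trend that crosses the wrap-around at 4 right angles be followed by eye without a break.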

Figure 5 shows a fairly strong relation between production and
imports, especially at the higher frequencies: the phase differences
are mostly rather close to 4 (or 0 or 8) right angles, and show no trend
with frequency. A simultaneous positive correlation between these two
series is indicated. Figure 4 shows a less clear relation between
production and price. At lower frequencies there is some suggestion of
trend in the phase differences, implying that production follows price,
possibly by two years, possibly by four. At higher frequencies the
phase differences seem very scattered. Not reproduced is a phase
difference plot for the price and imports series, suggesting quite a strong
simultaneous correlation at the lower frequencies, and not much at higher
frequencies.

Now the regression calculation in frequency bands, to estimate
$G(\lambda)$, $\phi(\lambda)$ and $\phi'(\lambda)$, can be performed. The window chosen
is 23 harmonics wide, and sine weights have been used. The results are
tabulated in Figure 6. The first column lists the harmonic number of the
central frequency in the band; we have stepped the central harmonic
number from the lowest possible value, 11, by unit steps to the greatest
possible value, 43. (Had there been many more harmonics and a greater
bandwidth, greater steps would have been convenient.) The next three
columns list estimates of spectral density for, respectively, copper
production, copper price and imports (prewhitened as explained above),
obtained from the raw line spectra by the 23-point sine-weighted moving
average. The next four columns refer to regression of copper production
on copper price. They list average values in the band of $G(\lambda)$,
$\phi(\lambda)$ and $\phi'(\lambda)$, and (in the fourth of these columns) multiple
$R^2$ (the coherency). The behavior of $\phi(\lambda)$ and of $R^2$ gives a
numerical measure of the trend seen in Figure 4. The last four columns of
Figure 6 give similar information for regression of copper production on
imports, and relate to Figure 5. (Simultaneous regression of copper
production on both copper price and imports is not considered at this
point.)
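The smoothing of a raw line spectrum by a 23-point sine-weighted moving average can be sketched as follows. The precise form of the sine weights is not stated in the text; weights proportional to $\sin(\pi k / 24)$ for $k = 1, \ldots, 23$ are assumed here, and the raw spectrum is a constant placeholder.

```python
import math

def sine_weights(width=23):
    """Weights proportional to sin(pi*k/(width+1)), k = 1..width, scaled to sum to 1."""
    w = [math.sin(math.pi * (k + 1) / (width + 1)) for k in range(width)]
    total = sum(w)
    return [v / total for v in w]

def smooth_spectrum(raw, width=23):
    """Moving average of a raw spectrum, keyed by the central harmonic of each band."""
    half = width // 2
    w = sine_weights(width)
    return {c: sum(wk * raw[c - half + k] for k, wk in enumerate(w))
            for c in range(half, len(raw) - half)}

raw = [1.0] * 55                           # placeholder values at harmonics 0..54
smoothed = smooth_spectrum(raw)
centers = sorted(smoothed)                 # central harmonics of the usable bands
```

With harmonics 0 through 54 and a window of 23, the first band that fits is centered at harmonic 11 and the last at harmonic 43, giving 33 bands.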

To test a null hypothesis of no association between series, 5%, 1%
and 0.1% values for $R^2$ for any given frequency band are estimated (by
a crude argument) at 0.19, 0.27 and 0.36, respectively; these values
probably err in being a little too low. The tabulated values are very
highly correlated, as one reads down the column. So for regression of
production on price, it seems reasonable to claim a substantial
correlation at low frequencies, in the bands centered between the 11th and
19th harmonics. For regression of production on imports, the
correlation is substantial in bands centered between the 23rd and 43rd
harmonics; $R^2$ is close to 0.5 in many of these bands.

Of our two predictor variables, copper price and general imports,
the latter has on the whole the greater correlation with copper
production. But the two predictor series have some correlation with each
other. How useful is the price series as a predictor in conjunction
with the imports series? Residual Fourier transforms of the production
series and of the price series, after regression on the imports series,
can be obtained, and a phase-difference plot can be made, analogous to
Figure 4 for the original Fourier transforms. This is shown in Figure 7.
The phase trend seems rather similar to that in Figure 4 at lower
frequencies, and weaker at higher frequencies.


Figure 8 shows a calculation like that in Figure 6, but relating
to simultaneous regression of production on both price and imports,
instead of to separate regressions. The $R^2$ in the final column is
always greater than either value of $R^2$ (for the same frequency band)
given in Figure 6. The most striking increase over the $R^2$ for
regression on imports only occurs for bands centered between the 25th and
28th harmonics: 0.479 instead of 0.332 at the 25th harmonic, 0.511 instead
of 0.359 at the 26th harmonic, etc. The same sort of crude argument as
before indicates that these four increases (but none of the others) can
be regarded as significant at the 5% level. The increases, on the
whole, are larger at lower frequencies than at higher frequencies.
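That the multiple $R^2$ cannot fall below either single-predictor $R^2$ is a general property of least squares, which the sketch below illustrates for two complex predictors in one band. It uses constant gain and phase, omits the window weights and the phase-slope parameter for brevity, and runs on invented numbers; both function names are hypothetical.

```python
def band_regression_two(X1, X2, Y):
    """Least-squares fit of Y = b1*X1 + b2*X2 with complex b1, b2; returns (b1, b2, R^2)."""
    s11 = sum(abs(a) ** 2 for a in X1)
    s22 = sum(abs(b) ** 2 for b in X2)
    s12 = sum(a.conjugate() * b for a, b in zip(X1, X2))
    s1y = sum(a.conjugate() * y for a, y in zip(X1, Y))
    s2y = sum(b.conjugate() * y for b, y in zip(X2, Y))
    det = s11 * s22 - abs(s12) ** 2            # determinant of the normal equations
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12.conjugate() * s1y) / det
    rss = sum(abs(y - b1 * a - b2 * b) ** 2 for a, b, y in zip(X1, X2, Y))
    return b1, b2, 1.0 - rss / sum(abs(y) ** 2 for y in Y)

def single_r2(X, Y):
    """Squared coherency for regression of Y on the single predictor X."""
    sxy = sum(a.conjugate() * y for a, y in zip(X, Y))
    return abs(sxy) ** 2 / (sum(abs(a) ** 2 for a in X) * sum(abs(y) ** 2 for y in Y))

# Made-up transforms at five harmonics, with a made-up error term.
X1 = [1 + 1j, 2 - 1j, -1 + 0.5j, 0.5 + 2j, -2 - 1j]
X2 = [0.5 - 1j, -1 + 1j, 2 + 0j, 1 + 1j, -0.5 + 0.5j]
noise = [0.3 - 0.1j, -0.2 + 0.4j, 0.1 + 0.2j, -0.4 - 0.3j, 0.2 - 0.2j]
Y = [(0.5 + 0.2j) * a + (-0.3 + 0.4j) * b + e for a, b, e in zip(X1, X2, noise)]

b1, b2, r2_both = band_regression_two(X1, X2, Y)
r2_first, r2_second = single_r2(X1, Y), single_r2(X2, Y)
```

Because the increase is guaranteed, it carries no evidential weight by itself, which is why a significance argument is needed before attributing it to a real effect of the second predictor.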

The  two  phase-shift  functions  estimated  in  Figure  8 can  be  fairly 
well  approximated  at  most  frequencies  by  saying  that  production  is 
correlated  positively  with  imports  of  the  same  year  and  negatively  with 
prices  of  four  years  before. 

Figures 9 and 10 are time-domain plots intended to show whether the
relations between the series perceived in the harmonic analysis pervade
the whole series or are special to particular epochs. For both plots,
the original series have been transformed to logarithms and a linear
trend has been subtracted. Then for Figure 9, low frequencies have
been suppressed by taking the second difference of the series, and the
resulting production values are plotted against the imports values. The
correlation coefficient is 0.60. The decade of each plotted point is
shown by the letters appearing on the right side of Figure 1; a star
means that two or more points have coincided. For Figure 10, the spectra
have been roughly whitened by taking the first difference of each
series, and then high frequencies have been suppressed by three simple
2-point averagings. The first 4 values of the resulting production series
and imports series have been dropped, and the last 4 values of the
resulting price series; and then the production values are plotted against
the linear combination of the imports values (for the same year) minus 0.8
times the price values (for four years earlier). The correlation
coefficient is 0.53. The decade of the production values is shown as before.
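The filtering and alignment used for Figure 10 can be sketched as below. The input series are synthetic stand-ins for the detrended logarithms, and the function names are hypothetical; only the sequence of operations and the coefficient 0.8 come from the text.

```python
import math

def first_difference(z):
    return [b - a for a, b in zip(z, z[1:])]

def two_point_average(z):
    return [(a + b) / 2 for a, b in zip(z, z[1:])]

def low_pass(z):
    """First difference, then three simple 2-point averagings (4 entries shorter)."""
    z = first_difference(z)
    for _ in range(3):
        z = two_point_average(z)
    return z

def lagged_combination(imports_f, price_f, lag=4, coef=0.8):
    """Imports of the same year minus coef times price `lag` years earlier."""
    return [m - coef * p for m, p in zip(imports_f[lag:], price_f[:-lag])]

def correlation(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return cov / math.sqrt(sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b))

# Synthetic stand-ins for the detrended log series, 116 annual values each.
imports = [math.sin(0.3 * t) for t in range(116)]
price = [math.cos(0.2 * t) for t in range(116)]
production = [0.5 * math.sin(0.3 * t + 0.1) for t in range(116)]

imports_f, price_f = low_pass(imports), low_pass(price)
predictor = lagged_combination(imports_f, price_f)
production_f = low_pass(production)[4:]    # drop the first 4 values to align years
r = correlation(production_f, predictor)
```

Dropping the first 4 filtered values of production and imports and the last 4 of price leaves 108 aligned pairs, each production value facing the price of four years before.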

The pronounced correlation in both Figures 9 and 10 is due to a few
extreme points labeled G or H, representing the two decades from
1920 to 1939. If all points for these decades were omitted, the
correlation would become 0.04 for Figure 9 and -0.08 for Figure 10. That is,
the correlation would disappear.

Discussion 

Harmonic  regression  is  a systematic  way  of  looking  for  association 
between  series  in  all  parts  of  the  frequency  range.  It  is  unlikely  to 
reveal  anything  that  cannot  be  found  by  careful  visual  comparison  of 
plots  such  as  those  in  Figures  2 and  3,  at  least  when  only  two  or  three 
series  are  under  consideration.  (A  similar  remark  can  be  made  about 
ordinary  regression.) 

We have found clear evidence of association between copper
production and general imports, at higher frequencies, and some suggestion of
predictive value for copper price also, at middle-to-low frequencies.
What associations there are seem to inhere in the economically turbulent
years of the 20's and 30's. We do not see similar associations in the
other decades. Possibly relations between these series are changing,
possibly the phenomena are highly nonlinear.


References

The detailed study on which this paper is based is

F. J. ANSCOMBE (in preparation). Statistical Computing with APL,
    Chap. 10: "Time series: Yale enrolment".

Other references are

H. AKAIKE and Y. YAMANOUCHI (1962). On the statistical estimation of
    frequency response function. Annals of the Institute of Statistical
    Mathematics, 14, 23-56.

G. E. P. BOX and G. M. JENKINS (1970, 1976). Time Series Analysis:
    Forecasting and Control. Holden-Day.

W. S. CLEVELAND (1967). Time series projection: theory and practice.
    Doctoral dissertation, Yale University.

W. S. CLEVELAND and E. PARZEN (1975). The estimation of coherence,
    frequency response, and envelope delay. Technometrics, 17, 167-172.

U. S. Bureau of the Census (1975). Historical Statistics of the United
    States, Colonial Times to 1970, Bicentennial Edition. U. S.
    Government Printing Office.


U.S. COPPER PRODUCTION (MINE OUTPUT, HUNDREDS OF SHORT TONS)

   +  |     0      1      2      3      4      5      6      7      8      9
 1860 |    81     84    106     95     90     95    100    112    130    140   A
 1870 |   141    146    140    174    196    202    213    235    241    256   B
 1880 |   302    358    453    578    725    829    789    907   1132   1134   C
 1890 |  1299   1421   1725   1647   1771   1903   2300   2470   2633   2843   D
 1900 |  3031   3010   3298   3490   4063   4444   4585   4236   4784   5633   E
 1910 |  5441   5574   6245   6178   5742   7440  10029   9477   9550   6062   F
 1920 |  6123   2331   4823   7389   8031   8391   8626   8250   9049   9976   G
 1930 |  7051   5289   2381   1906   2374   3865   6145   8420   5578   7283   H
 1940 |  8781   9581  10801  10908   9725   7729   6087   8476   8348   7528   I
 1950 |  9093   9283   9254   9264   8355   9986  11042  10869   9793   8248   J
 1960 | 10802  11652  12284  12132  12468  13517  14292   9541  12046  15446   K
 1970 | 17197  15220  16650  17180  15970  14110                               L

PRICE OF REFINED COPPER AT NEW YORK (CENTS PER POUND)

   +  |     0      1      2      3      4      5      6      7      8      9
 1860 | 22.88  22.25  21.88  33.88  47.00  39.25  34.25  25.38  23.00  24.25   A
 1870 | 21.19  24.12  35.56  28.00  22.00  22.69  21.00  19.00  16.56  18.62   B
 1880 | 21.50  18.25  18.50  15.88  13.75  11.10  11.00  11.25  16.80  13.75   C
 1890 | 15.75  12.88  11.50  10.65   9.43  10.70  10.92  11.30  12.01  17.75   D
 1900 | 16.54  16.40  11.96  13.62  13.11  15.98  19.77  20.86  13.39  13.11   E
 1910 | 12.88  12.55  16.48  15.52  13.31  17.47  28.46  29.19  29.19  18.90   F
 1920 | 17.50  12.65  13.56  14.61  13.16  14.16  13.95  13.05  14.68  18.23   G
 1930 | 13.11   8.24   5.67   7.15   8.53   8.76   9.58  13.27  10.10  11.07   H
 1940 | 11.40  11.87  11.87  11.87  11.87  11.87  13.92  21.15  22.20  19.36   I
 1950 | 21.46  24.37  24.37  28.92  29.82  37.39  41.88  29.99  26.13  30.62   J
 1960 | 32.16  30.14  30.82  30.82  32.17  35.19  35.82  38.01  41.17  47.43   K
 1970 | 58.07  52.00  51.20  59.50  77.30  64.20                               L

FIGURE 1


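FIGURE 2 (logarithm of the production series against date, linear
regression on date subtracted)

FIGURE 4 (phase differences between the production and price series,
by harmonic number)

FIGURE 5 (phase differences between the production and imports series,
by harmonic number)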

        Smoothed spectral   |  Regression coefficients of copper prodn, with R^2,
            estimates       |  (i) on copper price        |  (ii) on imports
Harm   Prod  Price    Imp   |     G    phi   phi'    R^2  |     G    phi   phi'    R^2

  11   1274    853    583   |  .635   6.10   15.1   .270  |  .625   6.26    1.1   .179
  12   1272    860    566   |  .666    .01   12.2   .299  |  .637   6.26    2.0   .210
  13   1252    861    542   |  .695    .15    9.8   .332  |  .756   6.26    2.4   .248
  14   1259    868    519   |  .692    .25    9.6   .330  |  .78?    .02    1.0   .255
  15   1258    879    503   |  .686    .36   10.6   .328  |  .796    .03     .3   .253
  16   1240    883    486   |  .674    .46   10.7   .323  |  .80?    .03    -.4   .253
  17   1213    880    472   |  .665    .57   11.3   .321  |  .795    .02    -.6   .246
  18   1166    867    455   |  .652    .69   11.9   .316  |  .779   6.28    -.5   .237
  19   1109    852    439   |  .621    .79   11.5   .296  |  .768   6.26     .0   .233
  20   1058    824    427   |  .580    .86   10.5   .262  |  .766   6.21   -1.8   .237
  21   1014    805    427   |  .525    .84    7.5   .219  |  .762   6.18   -3.3   .244
  22    972    777    427   |  .487    .70    1.8   .190  |  .769   6.15   -3.8   .259
  23    936    745    425   |  .473    .60   -1.8   .176  |  .790   6.14   -4.8   .284
  24    893    719    427   |  .457    .56   -3.0   .168  |  .801   6.12   -4.9   .307
  25    842    706    439   |  .441    .51   -3.3   .163  |  .798   6.08   -4.6   .332
  26    782    694    446   |  .409    .48    3.6   .149  |  .793   6.04   -3.9   .359
  27    716    672    471   |  .378    .46   -4.2   .134  |  .761   6.02   -3.1   .381
  28    646    641    491   |  .349    .42   -5.7   .121  |  .736   5.99   -2.0   .412
  29    617    620    507   |  .355    .42  -10.5   .127  |  .717   5.97     .3   .422
  30    614    597    517   |  .391    .44  -14.9   .149  |  .715   5.94    2.0   .431
  31    620    589    525   |  .400    .36  -16.4   .152  |  .729   5.90    4.9   .450
  32    634    589    531   |  .413    .21  -15.1   .158  |  .743   5.93    6.0   .463
  33    641    582    537   |  .430    .07  -13.7   .168  |  .753   5.98    6.7   .475
  34    644    567    536   |  .450   6.22  -12.3   .178  |  .766   6.04    7.2   .488
  35    642    545    533   |  .477   6.11  -10.5   .193  |  .774   6.10    7.4   .497
  36    637    515    526   |  .513   6.00   -8.7   .213  |  .775   6.15    6.7   .496
  37    627    486    512   |  .550   5.90   -5.7   .234  |  .775   6.19    5.9   .491
  38    625    458    492   |  .578   5.83   -5.5   .245  |  .781   6.24    5.5   .480
  39    630    438    471   |  .600   5.79   -4.8   .250  |  .795   6.28    5.1   .473
  40    640    432    450   |  .633   5.77   -3.5   .270  |  .810    .00    3.8   .462
  41    659    425    428   |  .641   5.78   -2.4   .266  |  .834    .01    2.6   .452
  42    669    421    402   |  .645   5.77   -1.6   .262  |  .861    .02    1.4   .446
  43    675    417    375   |  .652   5.76   -1.2   .262  |  .895    .04     .0   .445

FIGURE 6


PHASE DIFFERENCE (RIGHT ANGLES): PRODUCTION RESIDUALS AND PRICE RESIDUALS

FIGURE 7


        Smoothed spectral   |  Regression coefficients of copper prodn       |  Mult.
            estimates       |  on copper price      |  and on imports        |   R^2
Harm   Prod  Price    Imp   |     G    phi   phi'   |     G    phi   phi'    |

  11   1274    853    583   |  .542   5.92   18.2   |  .439    .34    6.2    |  .345
  12   1272    860    566   |  .542   6.09   15.9   |  .424    .33   -4.6    |  .359
  13   1252    861    542   |  .537   6.26   12.5   |  .411    .29   -1.3    |  .379
  14   1259    868    519   |  .520    .12   12.2   |  .431    .24   -2.0    |  .380
  15   1258    879    503   |  .515    .26   13.2   |  .439    .18   -3.1    |  .380
  16   1240    883    486   |  .504    .39   13.4   |  .461    .15   -3.9    |  .381
  17   1213    880    472   |  .506    .52   13.9   |  .466    .11   -4.4    |  .382
  18   1166    867    455   |  .505    .66   14.4   |  .469    .05   -5.0    |  .382
  19   1109    852    439   |  .482    .78   14.7   |  .504   6.28   -5.3    |  .377
  20   1058    824    427   |  .450    .90   15.6   |  .572   6.26   -5.5    |  .372
  21   1014    805    427   |  .429   1.01   18.0   |  .660    .00   -6.3    |  .378
  22    972    777    427   |  .416   1.18   20.0   |  .734   6.27   -5.9    |  .391
  23    936    745    425   |  .414   1.35   20.?   |  .807   6.25   -6.1    |  .415
  24    893    719    427   |  .421   1.53   20.9   |  .871   6.20   -5.9    |  .444
  25    842    706    439   |  .434   1.72   21.0   |  .919   6.15   -5.7    |  .479
  26    782    694    446   |  .434   1.89   18.?   |  .957   6.04   -4.5    |  .511
  27    716    672    471   |  .425   2.09   17.9   |  .952   5.99   -3.2    |  .525
  28    646    641    491   |  .414   2.34   18.9   |  .950   5.98   -1.7    |  .548
  29    617    620    507   |  .394   2.81   24.6   |  .923   6.06    -.0    |  .539
  30    614    597    517   |  .391   3.05   25.0   |  .925   6.04    1.8    |  .537
  31    620    589    525   |  .394   3.18   27.6   |  .943   5.99    4.4    |  .554
  32    634    589    531   |  .385   3.39   27.8   |  .942   6.01    5.7    |  .563
  33    641    582    537   |  .363   3.65   28.3   |  .925   6.05    6.0    |  .567
  34    644    567    536   |  .344   3.91   28.8   |  .913   6.10    6.4    |  .572
  35    642    545    533   |  .325   4.18   29.0   |  .896   6.15    6.3    |  .572
  36    637    515    526   |  .302   4.51   29.5   |  .872   6.17    5.2    |  .559
  37    627    486    512   |  .234   4.78   30.7   |  .857   6.19    4.5    |  .546
  38    625    458    492   |  .275   5.10   30.9   |  .847   6.22    4.1    |  .530
  39    630    438    471   |  .271   5.41   23.8   |  .823   6.28    2.8    |  .516
  40    640    432    450   |  .315   5.56   64.2   |  .735   6.24    6.0    |  .517
  41    659    425    428   |  .352   6.19   62.5   |  .766   6.26    5.4    |  .517
  42    669    421    402   |  .359    .47   62.7   |  .784    .01    4.2    |  .513
  43    675    417    375   |  .357   1.05   63.3   |  .804    .05    2.4    |  .513

FIGURE 8


Copper production vs. imports: high-frequency filtering.

FIGURE 9

Copper production vs. imports and price: low-frequency filtering.

FIGURE 10


REPORT DOCUMENTATION PAGE

Security classification: Unclassified
Report number: 44
Title: HARMONIC REGRESSION
Type of report: Technical Report
Author: Francis J. Anscombe
Contract or grant number: N00014-75-C-0563
Performing organization: Department of Statistics, Yale University,
    New Haven, CT 06520
Program element, project, task area and work unit numbers: NR 042-242
Controlling office: Office of Naval Research, Statistics and Probability
    Program (Code 436), Arlington, VA 22217
Report date: December, 1977
Number of pages: i + 24 + ii = 27
Distribution statement: Approved for public release; distribution unlimited.
Supplementary notes: Presented at the Third ERDA Statistical Symposium,
    Richland, WA, on 26 October 1977. To be published in Proceedings of
    the Third ERDA Statistical Symposium.
Key words: Time series analysis; Spectral analysis; Regression of time
    series; Copper production
Abstract: See next page.


Abstract

Ordinary linear regression, by the method of least squares, is used to
determine a linear relation subsisting between given independent
observations of two or more variables. An analogous problem for time series
is to determine a linear relation subsisting between two or more given
stationary series. The linear relation may take the form that one series is
a linear filtering of the other series, plus a stationary error process.
The coefficients of the filter can be determined directly by multiple
regression in the time domain, but there are difficulties. An easier
procedure, leading to more intelligible results, is to estimate the Fourier
transform of the coefficients of the filter for each predictor series
(conveniently expressed as a gain function and a phase-shift function) by
simpler regression calculations in the frequency domain. The procedure
is illustrated by a study of interrelationship of an annual series of
output of U.S. copper mines from 1860 to 1975, and two annual economic
series relating to the same time period, namely a series of copper prices
at New York and a series of total dollar value of general imports of
merchandise into the U.S.


